Observational Research

PSCI 2270 - Week 10

Georgiy Syunyaev

Department of Political Science, Vanderbilt University

November 7, 2023

Plan for this week


  1. Project Updates

  2. What if we cannot randomize?

  3. Instrumental Variables: Colonial origins

  4. Regression Discontinuity: Finding close elections

  5. Differences-in-differences: Tabloid media in the UK

Any project updates?

What if we cannot randomize?

Recap on experiments


  • Key idea: Randomization of the treatment makes the treatment and control groups “identical” on average

  • The two groups are expected to be similar in terms of all characteristics (both observed and unobserved)

    • Control group is similar to treatment group
    • Outcome in control group \(\approx\) what would have happened to treatment group if they did not receive treatment
    • vice versa
  • If we want to study effects of factor \(X\) on \(Y\) we would ideally want to run an experiment

    • But what do we do if we cannot randomize?
    • E.g. we lack funding, or the phenomenon we study (such as elections) cannot reasonably be manipulated by the researcher
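The recap above can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the sample size, the unobserved `trait`, and the `true_effect` of 2.0 are all made-up assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical unobserved characteristic that also drives the outcome
trait = rng.normal(size=n)

# Random assignment: a coin flip that ignores every unit characteristic
treated = rng.integers(0, 2, size=n).astype(bool)

# Assumed data-generating process with a known treatment effect
true_effect = 2.0
outcome = true_effect * treated + trait + rng.normal(size=n)

# Balance check: the unobserved trait should be similar across groups
balance_gap = trait[treated].mean() - trait[~treated].mean()

# Difference in means should land close to the true effect
diff_in_means = outcome[treated].mean() - outcome[~treated].mean()

print(f"Gap in unobserved trait: {balance_gap:.3f}")
print(f"Difference in means:     {diff_in_means:.3f}")
```

Because assignment is independent of `trait`, the control group's average outcome stands in for what the treated group would have experienced without treatment.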

Second-best: “Natural” experiments


  • Natural experiments: Effects of random or as-if random events/processes that occur outside of researcher’s control and are related to factors we are interested in
  • Uses observational data, but is better than correlation

    • Researchers can claim that “treatment” assignment is not related to other factors that can explain outcomes \(\Rightarrow\) Solve the issue of confounding!
  • What is the difference between random and as-if random?

    • Random: We can prove that the process/event was decided by lottery or analogous procedure and know the probabilities
    • As-if random: We know the process/event is not truly random, but (under assumptions) it is unrelated to other factors that are linked to our outcomes

As-if random example

  • John Snow and the study of cholera in London in 1854 (!)

Study



  • Theories on cholera have two hypotheses of transmission: water or air

    • How would we test this experimentally?
  • As-if random event:

    • In one area of London where the outbreak happened in 1854, two companies supplied water
    • In 1852 the Lambeth company moved its water intake upstream \(\Rightarrow\) possibly cleaner water
    • John Snow collected data on cholera deaths over 7 weeks

Results

Deaths from cholera epidemic over 7 weeks, 1854
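| Water Supply Company   | Number of Houses | Deaths From Cholera | Cholera Deaths per 10,000 Houses |
|------------------------|------------------|---------------------|----------------------------------|
| Southwark and Vauxhall | 40,046           | 1,263               | 315                              |
| Lambeth                | 26,107           | 98                  | 37                               |
| Rest of London         | 256,423          | 1,422               | 59                               |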


  • Evidence: Strong support for the water hypothesis!
  • Was the exposure actually random?

    • Did companies choose strategically? No
    • Did people choose strategically? No
    • Are areas served by the company upstream different? No
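The comparison between the two companies can be reproduced with simple arithmetic. The sketch below uses only the house and death counts reported in the results table; the variable names are illustrative, and the computed rates may differ slightly from the table's figures because of rounding.

```python
# Counts taken from the results table (7 weeks, 1854)
houses = {"Southwark and Vauxhall": 40_046, "Lambeth": 26_107}
deaths = {"Southwark and Vauxhall": 1_263, "Lambeth": 98}

# Cholera deaths per 10,000 houses for each company
rates = {name: deaths[name] / houses[name] * 10_000 for name in houses}

# How many times higher the death rate is for the downstream supplier
ratio = rates["Southwark and Vauxhall"] / rates["Lambeth"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} deaths per 10,000 houses")
print(f"Rate ratio: {ratio:.1f}")
```

A death rate roughly eight times higher among households supplied by Southwark and Vauxhall is what makes the evidence for the water hypothesis so strong, provided the as-if random assumption holds.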

How to find natural experiments



  • Random: Lotteries related to politically relevant processes; Timing of events

  • As-if random: Borders and historical events

  • Process of looking for natural experiments is always creative, but there are some commonly used types
  • Let’s brainstorm this in groups!

From Dunning (2012)

Making as-if random


  • Analyses of as-if random events often require adjustments to the standard difference-in-means analysis
  • Regression with covariates: Control for possible confounders \(\Leftarrow\) Football affects elections
  • Instrumental variables: Use a factor that shifts the treatment but affects the outcome only through the treatment \(\Leftarrow\) Colonial origins
  • Regression Discontinuity: Use some naturally occurring discontinuity (e.g. taxation) to compare units around it \(\Leftarrow\) Extremists win primaries
  • Differences-in-Differences: Compare trends between treated and untreated units even if we know there might be differences between them \(\Leftarrow\) Tabloid media in UK

Colonial origins of Comparative Development

Colonial origins


  • “The Colonial Origins of Comparative Development: An Empirical Investigation” Acemoglu, Johnson, and Robinson (2001) (17566 citations!)
  • Summary:

    • Observational study on the long-term effects of colonial institutions on country development
    • Heavy use of historical data
    • Institutions are not randomly assigned! Deal with this using instrumental variables
    • Colonial institutions persist over time and affect the income per capita today!

Theory



  • Question: Why are some countries rich and others poor? (\(Y\))

  • Hypothesis: Extractive institutions (\(X\)) are persistent and can affect long-term development

  • Searching for as-if random assignment:

    • Why do countries have different institutions?
    • Some countries were forced into extractive institutions by colonialists
    • Are there as-if random reasons for setting extractive institutions?

Instrumental variables


  • What is an instrument: Some factor that affects our main independent variable (early institutions) but is unlikely to affect our main outcome (current development) directly

    • \(Z \rightarrow X \rightarrow Y\), but \(Z \not\rightarrow Y\)
    • Why is the last part important? Excludability!
    • What is the instrument in Acemoglu, Johnson, and Robinson (2001)?
  • Instrument in Acemoglu, Johnson, and Robinson (2001):

    • (potential) settler mortality \(\Rightarrow\) settlements \(\Rightarrow\) early institutions \(\Rightarrow\) current institutions \(\Rightarrow\) current economic development
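The logic of \(Z \rightarrow X \rightarrow Y\), but \(Z \not\rightarrow Y\) can be written out for the simplest case. This is a sketch under assumptions that are not in the slides: a linear model with a single instrument, where the coefficients \(\pi_0, \pi_1, \beta_0, \beta_1\) and the error terms \(v, \varepsilon\) are notation introduced here for illustration.

```latex
% First stage: instrument Z shifts the independent variable X
X = \pi_0 + \pi_1 Z + v
% Outcome equation: X affects Y; Z does not appear directly
Y = \beta_0 + \beta_1 X + \varepsilon

% Relevance:      \pi_1 \neq 0
% Excludability:  \operatorname{Cov}(Z, \varepsilon) = 0

% Taking the covariance of both sides of the outcome equation with Z:
\operatorname{Cov}(Z, Y) = \beta_1 \operatorname{Cov}(Z, X) + \operatorname{Cov}(Z, \varepsilon)
                         = \beta_1 \operatorname{Cov}(Z, X)

% Hence the instrumental-variables estimand:
\beta_1 = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
```

If excludability fails, \(\operatorname{Cov}(Z, \varepsilon) \neq 0\) and the ratio no longer isolates \(\beta_1\), which is why that assumption carries so much weight.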

How do we use instruments


  • An instrument affects the outcome only through the independent variable \(\Rightarrow\) Estimate the effect on the outcome using only the variation in the independent variable that is predicted by the instrument
  • Instead of simple correlation we use two-stage procedure (2SLS)

    1. Predict the current institutions using historical data on mortality/settlements
    2. Look at the correlation between predicted institutions and economic development
  • To show that an instrument is valid, the researcher needs to:

    • Show that instrument indeed predicts independent variable
    • Provide evidence that instrument is unlikely to affect outcome directly
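The two-stage procedure can be sketched on simulated data. This is an illustration under made-up assumptions, not the analysis in Acemoglu, Johnson, and Robinson (2001): the variables `z`, `u`, `x`, `y`, the coefficients, and the helper `ols` are all hypothetical, with `z` standing in for an instrument such as settler mortality and `u` for an unobserved confounder.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Assumed data-generating process (all values are illustrative)
z = rng.normal(size=n)                 # instrument
u = rng.normal(size=n)                 # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)   # independent variable, shifted by z and u
beta = 1.5                             # true effect of x on y
y = beta * x + 2.0 * u + rng.normal(size=n)  # z enters y only through x


def ols(design, target):
    """Return least-squares coefficients for target regressed on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


ones = np.ones(n)

# Naive regression of y on x is biased because u drives both
naive = ols(np.column_stack([ones, x]), y)[1]

# Stage 1: predict x from the instrument
Z = np.column_stack([ones, z])
x_hat = Z @ ols(Z, x)

# Stage 2: regress y on the predicted values of x
two_stage = ols(np.column_stack([ones, x_hat]), y)[1]

print(f"True effect:    {beta:.2f}")
print(f"Naive estimate: {naive:.2f}")
print(f"2SLS estimate:  {two_stage:.2f}")
```

In this setup the naive estimate overstates the effect, while the two-stage estimate lands near the true value. Running the second stage by hand gives the right point estimate but not the right standard errors, so applied work typically relies on a dedicated 2SLS routine.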

How do they do it?



  • How do they operationalize economic development (\(Y\)), institutions (\(X\)) and settlements/mortality (\(Z\))?
  • How do they prove that settlements did not affect economic development through other channels?
  • What analyses do they run?

Main results

  • (potential) settler mortality/settlements \(\Rightarrow\) early institutions \(\Rightarrow\) current institutions

Main results

  • (potential) settler mortality/settlements \(\Rightarrow\) current institutions \(\Rightarrow\) current economic development

Main critiques


  • Different sources of data (some predicted mortality, some direct measures) \(\Rightarrow\) Selection bias

  • Data on troops (some in barracks and some on campaign) \(\Rightarrow\) Selection bias

  • Why use data on troops at all if it is not the same as settlers \(\Rightarrow\) Measurement validity

  • Can settlers affect current levels of development through channels other than institutions? \(\Rightarrow\) Violation of excludability

  • \(\Rightarrow\) No Nobel Prize 😢

References

Acemoglu, Daron, Simon Johnson, and James A. Robinson. 2001. “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91 (5): 1369-1401.
Dunning, Thad. 2012. Natural Experiments in the Social Sciences: A Design-Based Approach. Cambridge University Press.